论文合集 | 联邦学习 x INFOCOM'2023
本文由白小鱼博主整理,汇总了 INFOCOM 2023 会议中与联邦学习相关的论文及摘要翻译。
Authors: Ruiting Zhou; Jieling Yu; Ruobei Wang; Bo Li; Jiacheng Jiang; Libing Wu
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated Learning (FL) enables potentially a large number of clients to collaboratively train a global model with the coordination of a central cloud server without exposing client raw data. However, the FL model convergence performance, often measured by the job completion time, is hindered by two critical factors: non independent and identically distributed (non-IID) data across clients and the straggler effect. In this work, we propose a clustered FL framework, MCFL, to minimize the job completion time by mitigating the influence of non-IID data and the straggler effect while guaranteeing the FL model convergence performance. MCFL builds upon a two-stage operation: i) a clustering algorithm constructs clusters, each containing clients with similar computing and communications capabilities to combat the straggler effect within a cluster; ii) a deep reinforcement learning (DRL) algorithm based on soft actor-critic with discrete actions intelligently selects a subset of clients from each cluster to mitigate the impact of non-IID data, and derives the number of intra-cluster aggregation iterations for each cluster to reduce the straggler effect among clusters. Extensive testbed experiments are conducted under various configurations to verify the efficacy of MCFL. The results show that MCFL can reduce the job completion time by up to 70% compared with three state-of-the-art FL frameworks.
ISSN: 2641-9874 abstractTranslation: 联邦学习 (FL) 使潜在的大量客户端能够在中央云服务器的协调下协作训练全局模型,而无需暴露客户端原始数据。然而,通常以作业完成时间来衡量的 FL 模型收敛性能受到两个关键因素的阻碍:客户端之间的非独立同分布(非 IID)数据以及落后者(straggler)效应。在这项工作中,我们提出了一种聚类式 FL 框架 MCFL,通过减轻非 IID 数据和落后者效应的影响来最小化作业完成时间,同时保证 FL 模型的收敛性能。MCFL 建立在两阶段操作的基础上:i) 聚类算法构建集群,每个集群包含计算和通信能力相近的客户端,以对抗集群内的落后者效应;ii) 基于离散动作软演员-评论家(soft actor-critic)的深度强化学习(DRL)算法从每个集群中智能地选择客户端子集,以减轻非 IID 数据的影响,并推导出每个集群的簇内聚合迭代次数,以减少集群之间的落后者效应。我们在多种配置下进行了大量测试床实验来验证 MCFL 的有效性。结果表明,与三种最先进的 FL 框架相比,MCFL 可以将作业完成时间最多缩短 70%。
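为便于理解 MCFL 第一阶段"按能力聚类客户端"的思路,下面给出一个极简的 Python 示意草图(非论文原始算法,客户端能力的二维表示与 k 值均为假设),用简单的 k-means 把计算/通信能力相近的客户端分到同一簇:

```python
# 仅为示意性草图:按客户端的计算/通信能力做简单聚类(非论文原始算法)
# 假设:每个客户端用 (每秒可处理的样本数, 上行带宽 Mbps) 两维能力向量描述
import random

def kmeans_cluster(capabilities, k, iters=20):
    """朴素 k-means,将能力相近的客户端分到同一簇,以缓解簇内落后者效应。"""
    centers = random.sample(capabilities, k)
    assignment = [0] * len(capabilities)
    for _ in range(iters):
        # 分配:每个客户端归入最近的簇中心
        for i, c in enumerate(capabilities):
            assignment[i] = min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(c, centers[j])))
        # 更新:重新计算每个簇的中心
        for j in range(k):
            members = [capabilities[i] for i in range(len(capabilities)) if assignment[i] == j]
            if members:
                centers[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment

# 用法示例:20 个客户端,能力随机生成,聚成 4 个簇
clients = [(random.uniform(10, 100), random.uniform(1, 50)) for _ in range(20)]
print(kmeans_cluster(clients, k=4))
```

论文第二阶段(基于离散动作 SAC 的客户端选择与簇内迭代次数决策)依赖具体的 DRL 实现,此处不展开。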
Authors: Yuxi Zhao; Xiaowen Gong; Shiwen Mao
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) has recently emerged as a promising paradigm that trains machine learning (ML) models on clients' devices in a distributed manner without the need of transmitting clients' data to the FL server. In many applications of ML (e.g., image classification), the labels of training data need to be generated manually by human agents (e.g., recognizing and annotating objects in an image), which are usually costly and error-prone. In this paper, we study FL with crowdsourced data labeling where the local data of each participating client of FL are labeled manually by the client. We consider the strategic behavior of clients who may not make desired effort in their local data labeling and local model computation (quantified by the mini-batch size used in the stochastic gradient computation), and may misreport their local models to the FL server. We first characterize the performance bounds on the training loss as a function of clients' data labeling effort, local computation effort, and reported local models, which reveal the impacts of these factors on the training loss. With these insights, we devise Labeling and Computation Effort and local Model Elicitation (LCEME) mechanisms which incentivize strategic clients to make truthful efforts as desired by the server in local data labeling and local model computation, and also report true local models to the server. The truthful design of the LCEME mechanism exploits the non-trivial dependence of the training loss on clients' hidden efforts and private local models, and overcomes the intricate coupling in the joint elicitation of clients' efforts and local models. Under the LCEME mechanism, we characterize the server’s optimal local computation effort assignments and analyze their performance. We evaluate the proposed FL algorithms with crowdsourced data labeling and the LCEME mechanism for the MNIST-based hand-written digit classification. The results corroborate the improved learning accuracy and cost-effectiveness of the proposed approaches.
ISSN: 2641-9874 abstractTranslation: 联邦学习 (FL) 最近成为一种有前途的范式,它以分布式方式在客户端设备上训练机器学习 (ML) 模型,而无需将客户端数据传输到 FL 服务器。在机器学习的许多应用中(例如图像分类),训练数据的标签需要由人工手动生成(例如识别并标注图像中的对象),这通常成本高昂且容易出错。在本文中,我们研究带有众包数据标注的 FL,其中每个参与 FL 的客户端的本地数据由该客户端手动标注。我们考虑客户端的策略性行为:它们可能不会在本地数据标注和本地模型计算(以随机梯度计算中使用的小批量大小来量化)上付出期望的努力,并且可能向 FL 服务器误报其本地模型。我们首先刻画了训练损失关于客户端数据标注努力、本地计算努力和所报告本地模型的性能界限,揭示了这些因素对训练损失的影响。基于这些洞见,我们设计了标注与计算努力及本地模型引发(LCEME)机制,激励策略性客户端按服务器的期望在本地数据标注和本地模型计算上付出真实的努力,并向服务器报告真实的本地模型。LCEME 机制的真实性设计利用了训练损失对客户端隐藏努力和私有本地模型的非平凡依赖,并克服了联合引发客户端努力与本地模型时的复杂耦合。在 LCEME 机制下,我们刻画了服务器的最优本地计算努力分配并分析了其性能。我们在基于 MNIST 的手写数字分类任务上评估了所提出的带众包数据标注的 FL 算法和 LCEME 机制。结果证实了所提方法在学习准确性和成本效益上的提升。
Authors: Chen Zhang; Boyang Zhou; Zhiqiang He; Zeyuan Liu; Yanjiao Chen; Wenyuan Xu; Baochun Li
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning is exposed to model poisoning attacks as compromised clients may submit malicious model updates to pollute the global model. To defend against such attacks, robust aggregation rules are designed for the centralized server to winnow out outlier updates, and to significantly reduce the effectiveness of existing poisoning attacks. In this paper, we develop an advanced model poisoning attack against defensive aggregation rules. In particular, we exploit the catastrophic forgetting phenomenon during the process of continual learning to destroy the memory of the global model. Our proposed framework, called Oblivion, features two special components. The first component prioritizes the weights that have the most influence on the model accuracy for poisoning, which induces a more significant degradation on the global model than equally perturbing all weights. The second component smooths malicious model updates based on the number of selected compromised clients in the current round, adjusting the degree of poisoning to suit the dynamics of each training round. We implement a fully-functional prototype of Oblivion in PLATO, a real-world scalable federated learning framework. Our extensive experiments over three datasets demonstrate that Oblivion can boost the attack performance of model poisoning attacks against unknown defensive aggregation rules.
ISSN: 2641-9874 abstractTranslation: 联邦学习容易遭受模型投毒攻击,因为被攻陷的客户端可能会提交恶意模型更新来污染全局模型。为了防御此类攻击,研究者为中心服务器设计了鲁棒聚合规则,以筛除离群更新,并显著降低现有投毒攻击的有效性。在本文中,我们开发了一种针对防御性聚合规则的高级模型投毒攻击。特别地,我们利用持续学习过程中的灾难性遗忘现象来破坏全局模型的记忆。我们提出的框架称为 Oblivion,具有两个特殊组件。第一个组件优先选择对模型精度影响最大的权重进行投毒,这比同等程度地扰动所有权重会导致全局模型更显著的退化。第二个组件根据本轮中被选中的受攻陷客户端数量来平滑恶意模型更新,调整投毒程度以适应每轮训练的动态。我们在 PLATO(一个现实世界的可扩展联邦学习框架)中实现了 Oblivion 的全功能原型。我们在三个数据集上进行的大量实验表明,Oblivion 可以提升模型投毒攻击在面对未知防御性聚合规则时的攻击性能。
Notes:
[CODE](https://github.com/TL-System/plato/)
Authors: Tao Wu; Yuben Qu; Chunsheng Liu; Yuqian Jing; Feiyu Wu; Haipeng Dai; Chao Dong; Jiannong Cao
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) has been proposed as a promising distributed learning paradigm to realize edge artificial intelligence (AI) without revealing the raw data. Nevertheless, it would incur inevitable costs in terms of training latency and energy consumption, due to periodical communication between user equipments (UEs) and the geographically remote central parameter server. Thus motivated, we study the joint edge aggregation and association problem to minimize the total cost, where the model aggregation over multiple cells just happens at the network edge. After proving its hardness with complex coupled variables, we transform it into a set function optimization problem and prove the objective function is neither submodular nor supermodular, which further complicates the problem. To tackle this difficulty, we first split it into multiple edge association subproblems, where the optimal solution to the computation resource allocation can be efficiently obtained in the closed form. We then construct a substitute function with the supermodularity and provable upper bound. On this basis, we reformulate an equivalent set function minimization problem under a matroid base constraint. We then propose an approximation algorithm to the original problem based on the two-stage search strategy with theoretical performance guarantee. Both extensive simulations and field experiments are conducted to validate the effectiveness of our proposed solution.
ISSN: 2641-9874 abstractTranslation: 联邦学习(FL)被认为是一种有前景的分布式学习范式,可以在不泄露原始数据的情况下实现边缘人工智能(AI)。然而,由于用户设备(UE)与地理上遥远的中央参数服务器之间需要周期性通信,它会在训练延迟和能耗方面带来不可避免的成本。受此启发,我们研究边缘聚合与关联的联合优化问题,以最小化总成本,其中跨多个小区的模型聚合仅发生在网络边缘。在证明该问题因复杂的耦合变量而难以求解后,我们将其转化为集合函数优化问题,并证明目标函数既不是次模的也不是超模的,这进一步增加了问题的难度。为了解决这一困难,我们首先将其拆分为多个边缘关联子问题,其中计算资源分配的最优解可以以闭式高效求得。然后我们构造了一个具有超模性和可证明上界的替代函数。在此基础上,我们重新表述了拟阵基约束下的等价集合函数最小化问题。接着,我们提出了一种基于两阶段搜索策略、具有理论性能保证的原问题近似算法。我们进行了大量仿真和现场实验来验证所提方案的有效性。
CriticalFL: A Critical Learning Periods Augmented Client Selection Framework for Efficient Federated Learning
Authors: Gang Yan; Hao Wang; Xu Yuan; Jian Li
Conference : Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Url: https://dl.acm.org/doi/10.1145/3580305.3599293
Abstract: Federated learning (FL) is a distributed optimization paradigm that learns from data samples distributed across a number of clients. Adaptive client selection that is cognizant of the training progress of clients has become a major trend to improve FL efficiency but not yet well-understood. Most existing FL methods such as FedAvg and its state-of-the-art variants implicitly assume that all learning phases during the FL training process are equally important. Unfortunately, this assumption has been revealed to be invalid due to recent findings on critical learning periods (CLP), in which small gradient errors may lead to an irrecoverable deficiency on final test accuracy. In this paper, we develop CriticalFL, a CLP augmented FL framework to reveal that adaptively augmenting existing FL methods with CLP, the resultant performance is significantly improved when the client selection is guided by the discovered CLP. Experiments based on various machine learning models and datasets validate that the proposed CriticalFL framework consistently achieves an improved model accuracy while maintains better communication efficiency as compared to state-of-the-art methods, demonstrating a promising and easily adopted method for tackling the heterogeneity of FL training.
abstractTranslation: 联邦学习 (FL) 是一种分布式优化范式,它从分布在多个客户端的数据样本中学习。了解客户端训练进度的自适应客户端选择已成为提高 FL 效率的主要趋势,但尚未得到充分理解。大多数现有的 FL 方法(例如 FedAvg 及其最先进的变体)隐含地假设 FL 训练过程中的所有学习阶段都同等重要。不幸的是,最近关于关键学习期(CLP)的发现表明这一假设并不成立:微小的梯度误差可能导致最终测试精度出现不可挽回的损失。在本文中,我们开发了 CriticalFL,一个由 CLP 增强的 FL 框架,表明用 CLP 自适应地增强现有 FL 方法后,当客户端选择由发现的 CLP 指导时,性能得到显著提升。基于多种机器学习模型和数据集的实验验证了所提出的 CriticalFL 框架始终能够提高模型精度,同时相比最先进的方法保持更好的通信效率,展示了一种有前途且易于采用的方法来应对联邦学习训练中的异构性。
Notes:
[PDF](https://arxiv.org/abs/2109.05613)
Authors: Xiong Wang; Yuxin Chen; Yuqing Li; Xiaofei Liao; Hai Jin; Bo Li
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) enables massive clients to collaboratively train a global model by aggregating their local updates without disclosing raw data. Communication has become one of the main bottlenecks that prolongs the training process, especially under large model variances due to skewed data distributions. Existing efforts mainly focus on either single momentum-based gradient descent, or random client selection for potential variance reduction, yet both often lead to poor model accuracy and system efficiency. In this paper, we propose FedMoS, a communication-efficient FL framework with coupled double momentum-based update and adaptive client selection, to jointly mitigate the intrinsic variance. Specifically, FedMoS maintains customized momentum buffers on both server and client sides, which track global and local update directions to alleviate the model discrepancy. Taking momentum results as input, we design an adaptive selection scheme to provide a proper client representation during FL aggregation. By optimally calibrating clients’ selection probabilities, we can effectively reduce the sampling variance, while ensuring unbiased aggregation. Through a rigid analysis, we show that FedMoS can attain the theoretically optimal $\mathcal{O}(T^{-2/3})$ convergence rate. Extensive experiments using real-world datasets further validate the superiority of FedMoS, with 58%-87% communication reduction for achieving the same target performance compared to state-of-the-art techniques.
ISSN: 2641-9874 abstractTranslation: 联邦学习 (FL) 使大量客户能够通过聚合本地更新来协作训练全局模型,而无需披露原始数据。通信已成为延长训练过程的主要瓶颈之一,特别是在由于数据分布不均而导致模型方差较大的情况下。现有的工作主要集中在基于单一动量的梯度下降或随机客户端选择以减少潜在的方差,但这两种方法通常都会导致模型精度和系统效率较差。在本文中,我们提出了 FedMoS,这是一种高效通信的 FL 框架,具有耦合的基于双动量的更新和自适应客户端选择,以共同减轻内在方差。具体来说,FedMoS 在服务器和客户端上维护定制的动量缓冲区,跟踪全局和本地更新方向以减轻模型差异。以动量结果作为输入,我们设计了一种自适应选择方案,以在 FL 聚合期间提供适当的客户端表示。通过优化校准客户的选择概率,我们可以有效减少抽样方差,同时确保聚合的无偏性。通过严格的分析,我们表明 FedMoS 可以达到理论上最优的 $\mathcal{O}(T^{-2/3})$ 收敛速度。使用真实数据集进行的大量实验进一步验证了 FedMoS 的优越性,与最先进的技术相比,实现相同目标性能的通信量减少了 58%-87%。
Notes:
[PDF](https://wangxionghome.github.io/MainFL-TR.pdf)
[CODE](https://github.com/Distributed-Learning-Networking-Group/FedMoS/)
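下面是一个按摘要思路写的双动量更新示意草图(非 FedMoS 官方代码,动量系数、学习率以及按采样概率加权的方式均为假设),用于说明"客户端与服务器各自维护动量缓冲区 + 按选择概率校准实现无偏聚合"的大致形态:

```python
# 示意性草图(非 FedMoS 官方实现):服务器端与客户端各自维护动量缓冲区
# 假设模型参数用一维 numpy 向量表示;学习率/动量系数均为假设值
import numpy as np

class ClientState:
    def __init__(self, dim):
        self.momentum = np.zeros(dim)  # 客户端本地动量缓冲区

def client_update(state, local_grad, beta=0.9, lr=0.1):
    """客户端:用本地动量平滑本地梯度,返回本地更新量。"""
    state.momentum = beta * state.momentum + (1 - beta) * local_grad
    return -lr * state.momentum

def server_aggregate(global_momentum, client_updates, probs, alpha=0.9):
    """服务器:按采样概率加权(保证期望无偏),再叠加全局动量。"""
    # 对被采样客户端的更新按 1/(N*p_i) 加权,得到无偏的平均更新
    n_total = len(probs)
    avg = sum(u / (n_total * probs[i]) for i, u in client_updates.items())
    global_momentum = alpha * global_momentum + (1 - alpha) * avg
    return global_momentum

# 用法示例:5 个客户端中采样 3 个,维度为 5 的玩具模型
dim, probs = 5, [0.5, 0.3, 0.2, 0.4, 0.6]
states = {i: ClientState(dim) for i in range(len(probs))}
updates = {i: client_update(states[i], np.random.randn(dim)) for i in [0, 2, 4]}
print(server_aggregate(np.zeros(dim), updates, probs))
```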
Authors: Shiqiang Wang; Jake Perazzone; Mingyue Ji; Kevin S. Chan
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) enables distributed model training from local data collected by users. In distributed systems with constrained resources and potentially high dynamics, e.g., mobile edge networks, the efficiency of FL is an important problem. Existing works have separately considered different configurations to make FL more efficient, such as infrequent transmission of model updates, client subsampling, and compression of update vectors. However, an important open problem is how to jointly apply and tune these control knobs in a single FL algorithm, to achieve the best performance by allowing a high degree of freedom in control decisions. In this paper, we address this problem and propose FlexFL – an FL algorithm with multiple options that can be adjusted flexibly. Our FlexFL algorithm allows both arbitrary rates of local computation at clients and arbitrary amounts of communication between clients and the server, making both the computation and communication resource consumption adjustable. We prove a convergence upper bound of this algorithm. Based on this result, we further propose a stochastic optimization formulation and algorithm to determine the control decisions that (approximately) minimize the convergence bound, while conforming to constraints related to resource consumption. The advantage of our approach is also verified using experiments.
ISSN: 2641-9874 abstractTranslation: 联邦学习 (FL) 支持利用用户收集的本地数据进行分布式模型训练。在资源受限且动态性可能很高的分布式系统(例如移动边缘网络)中,FL 的效率是一个重要问题。现有工作分别考虑了不同的配置以使 FL 更加高效,例如模型更新的非频繁传输、客户端子采样和更新向量的压缩。然而,一个重要的开放问题是如何在单个 FL 算法中联合应用和调节这些控制旋钮,通过允许控制决策上的高度自由来实现最佳性能。在本文中,我们针对这个问题提出了 FlexFL——一种具有多个可灵活调节选项的 FL 算法。我们的 FlexFL 算法允许客户端进行任意速率的本地计算,以及客户端与服务器之间任意数量的通信,从而使计算和通信资源消耗均可调节。我们证明了该算法的收敛上界。基于这一结果,我们进一步提出了一种随机优化表述和算法,用于确定在满足资源消耗相关约束的同时(近似)最小化收敛界的控制决策。我们的方法的优势也通过实验得到了验证。
Notes:
[PDF](https://arxiv.org/abs/2212.08496)
[CODE](https://github.com/IBM/flexfl)
Authors: Junhao Wang; Lan Zhang; Yihang Cheng; Shaoang Li; Hong Zhang; Dongbo Huang; Xu Lan
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Vertical federated learning (VFL) enables multiple participants with different data features and the same sample ID space to collaboratively train a model in a privacy-preserving way. However, the high computational and communication overheads hinder the adoption of VFL in many resource-limited or delay-sensitive applications. In this work, we focus on reducing the communication cost and delay incurred by the transmission of intermediate results in VFL model serving. We investigate the inference results, and find that a large portion of test samples can be predicted correctly by the active party alone, thus the corresponding communication for federated inference is dispensable. Based on this insight, we theoretically analyze the "dispensable communication" and propose a novel tunable vertical federated learning framework, named TVFL, to avoid "dispensable communication" in model serving as much as possible. TVFL can smartly switch between independent inference and federated inference based on the features of the input sample. We further reveal that such tunability is highly related to the importance of participants’ features. Our evaluations on seven datasets and three typical VFL models show that TVFL can save 57.6% communication cost and reduce 57.1% prediction latency with little performance degradation.
ISSN: 2641-9874 abstractTranslation: 纵向联邦学习 (VFL) 使具有不同数据特征和相同样本 ID 空间的多个参与者能够以保护隐私的方式协作训练模型。然而,高计算和通信开销阻碍了 VFL 在许多资源有限或延迟敏感的应用中的采用。在这项工作中,我们的重点是减少 VFL 模型服务中中间结果传输所产生的通信成本和延迟。我们对推理结果进行了调查,发现很大一部分测试样本仅由主动方就可以正确预测,因此联邦推理的相应通信是可有可无的。基于这一见解,我们从理论上分析了“可有可无的通信”,并提出了一种新颖的可调谐纵向联邦学习框架,称为 TVFL,以尽可能避免模型服务中的“可有可无的通信”。TVFL可以根据输入样本的特征,在独立推理和联邦推理之间智能切换。我们进一步揭示,这种可调性与参与者特征的重要性高度相关。我们对七个数据集和三个典型 VFL 模型的评估表明,TVFL 可以节省 57.6% 的通信成本,减少 57.1% 的预测延迟,而性能下降很少。
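TVFL 的核心是在主动方置信度足够高时省去与被动方通信的联邦推理。下面是一个示意草图(非论文实现,阈值与模型接口均为假设),演示这种按样本切换的逻辑:

```python
# 示意性草图(非 TVFL 原始实现):主动方先独立预测,置信度足够高时跳过联邦推理
# 假设 active_model / federated_predict 为已训练好的可调用对象,阈值为假设值
import numpy as np

def tunable_inference(x, active_model, federated_predict, threshold=0.9):
    """若主动方 softmax 置信度超过阈值,则省去与被动方的通信。"""
    probs = active_model(x)                 # 主动方仅用自身特征得到的类别概率
    if np.max(probs) >= threshold:
        return int(np.argmax(probs)), "independent"   # 独立推理,无通信开销
    return federated_predict(x), "federated"          # 置信度不足,回退到联邦推理

# 用法示例:用随机"模型"演示切换逻辑
rng = np.random.default_rng(0)
fake_active = lambda x: rng.dirichlet(np.ones(10) * 0.3)   # 返回 10 类概率
fake_federated = lambda x: 7                                # 假设的联邦推理结果
print(tunable_inference(np.zeros(4), fake_active, fake_federated))
```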
Authors: Haozhao Wang; Wenchao Xu; Yunfeng Fan; Ruixuan Li; Pan Zhou
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated Learning enables collaboratively model training among a number of distributed devices with the coordination of a centralized server, where each device alternatively performs local gradient computation and communication to the server. FL suffers from significant performance degradation due to the excessive communication delay between the server and devices, especially when the network bandwidth of these devices is limited, which is common in edge environments. Existing methods overlap the gradient computation and communication to hide the communication latency to accelerate the FL training. However, the overlapping can also lead to an inevitable gap between the local model in each device and the global model in the server that seriously restricts the convergence rate of learning process. To address this problem, we propose a new overlapping method for FL, AOCC-FL, which aligns the local model with the global model via calibrated compensation such that the communication delay can be hidden without deteriorating the convergence performance. Theoretically, we prove that AOCC-FL admits the same convergence rate as the non-overlapping method. On both simulated and testbed experiments, we show that AOCC-FL achieves a comparable convergence rate relative to the non-overlapping method while outperforming the state-of-the-art overlapping methods.
ISSN: 2641-9874 abstractTranslation: 联邦学习可以在中央服务器的协调下在多个分布式设备之间进行协作模型训练,其中每个设备交替执行本地梯度计算并与服务器通信。由于服务器和设备之间的通信延迟过大,特别是当这些设备的网络带宽有限时,FL 的性能会显着下降,这在边缘环境中很常见。现有方法将梯度计算和通信重叠,以隐藏通信延迟,从而加速 FL 训练。然而,重叠也会导致每个设备中的本地模型与服务器中的全局模型之间不可避免地存在差距,严重限制了学习过程的收敛速度。为了解决这个问题,我们提出了一种新的 FL 重叠方法,AOCC-FL,该方法通过校准补偿将局部模型与全局模型对齐,从而可以隐藏通信延迟而不会降低收敛性能。理论上,我们证明了AOCC-FL 与非重叠方法具有相同的收敛速度。在模拟和测试台实验中,我们表明 AOCC-FL 实现了与非重叠方法相当的收敛速度,同时优于最先进的重叠方法。
Authors: Haolin Wang; Xuefeng Liu; Jianwei Niu; Shaojie Tang
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) is an emerging paradigm of distributed machine learning. However, when applied to wireless network scenarios, FL usually suffers from high communication cost because clients need to transmit their updated gradients to a server in every training round. Although many gradient compression techniques like sparsification and quantization are proposed, they compress clients’ gradients independently, without considering the correlations among gradients. In this paper, we propose SVDFed, a collaborative gradient compression framework for FL. SVDFed utilizes Singular Value Decomposition (SVD) to find a few basis vectors, whose linear combination can well represent clients’ gradients at a certain round. Due to the correlations among gradients, these basis vectors can still well approximate new gradients in many subsequent rounds. With the help of basis vectors, clients only need to upload the coefficients of the linear combination to the server, which greatly reduces communication cost. In addition, SVDFed leverages the classical PID (Proportional, Integral, Derivative) control to determine the proper time to update basis vectors to maintain their representation ability. Through experiments, we demonstrate that SVDFed outperforms existing gradient compression methods in FL. For example, compared to a popular gradient quantization method QSGD, SVDFed can reduce the communication overhead by 66 % and pending time by 99 %.
ISSN: 2641-9874 abstractTranslation: 联邦学习(FL)是分布式机器学习的新兴范例。然而,当应用于无线网络场景时,FL通常会面临较高的通信成本,因为客户端需要在每轮训练中将其更新的梯度传输到服务器。尽管提出了许多梯度压缩技术,例如稀疏化和量化,但它们独立地压缩客户端的梯度,没有考虑梯度之间的相关性。在本文中,我们提出了 SVDFed,一种用于 FL 的协作梯度压缩框架。SVDFed利用奇异值分解(SVD)来找到一些基向量,它们的线性组合可以很好地代表客户在某一轮的梯度。由于梯度之间的相关性,这些基向量在后续的许多轮中仍然可以很好地逼近新的梯度。借助基向量,客户端只需将线性组合的系数上传到服务器,大大降低了通信成本。此外,SVDFed 利用经典的 PID(比例、积分、微分)控制来确定更新基本向量的适当时间,以保持其表示能力。通过实验,我们证明 SVDFed 优于 FL 中现有的梯度压缩方法。例如,与流行的梯度量化方法 QSGD 相比,SVDFed 可以减少 66% 的通信开销和 99% 的等待时间。
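SVDFed 的关键是利用历史梯度的 SVD 基向量压缩后续梯度、只上传线性组合的系数。下面给出一个示意草图(非论文实现,维度与保留基向量个数均为假设);论文中用 PID 控制决定何时更新基向量,这里省略:

```python
# 示意性草图(非 SVDFed 原始实现):用历史梯度的 SVD 基向量压缩后续梯度
# 假设:梯度展平成一维向量;历史梯度矩阵的列为各轮梯度;k 为保留的基向量个数
import numpy as np

def build_basis(grad_history, k):
    """对历史梯度矩阵做 SVD,取前 k 个左奇异向量作为公共基。"""
    u, _, _ = np.linalg.svd(grad_history, full_matrices=False)
    return u[:, :k]                      # 形状 (d, k)

def compress(basis, grad):
    """客户端:只上传 k 维系数,而非 d 维梯度。"""
    return basis.T @ grad                # 形状 (k,)

def reconstruct(basis, coeffs):
    """服务器:用同一组基向量近似恢复梯度。"""
    return basis @ coeffs                # 形状 (d,)

# 用法示例:d=1000 维梯度,历史 20 轮,保留 8 个基向量
d, rounds, k = 1000, 20, 8
history = np.random.randn(d, rounds)
basis = build_basis(history, k)
g = history @ np.random.randn(rounds)    # 构造一个与历史相关的新梯度
g_hat = reconstruct(basis, compress(basis, g))
print("relative error:", np.linalg.norm(g - g_hat) / np.linalg.norm(g))
```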
Authors: Fei Wang; Lei Jiao; Konglin Zhu; Xiaojun Lin; Lei Li
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Cloud-edge systems are important Emergency Demand Response (EDR) participants that help maintain power grid stability and demand-supply balance. However, as users are increasingly executing artificial intelligence (AI) workloads in cloud-edge systems, existing EDR management has not been designed for AI workloads and thus faces the critical challenges of the complex trade-offs between energy consumption and AI model accuracy, the degradation of model accuracy due to AI model quantization, the restriction of AI training deadlines, and the uncertainty of AI task arrivals. In this paper, targeting Federated Learning (FL), we design an auction-based approach to overcome all these challenges. We firstly formulate a nonlinear mixed-integer program for the long-term social welfare optimization. We then propose a novel algorithmic approach that generates candidate training schedules, reformulates the original problem into a new schedule selection problem, and solves this new problem using an online primal-dual-based algorithm, with a carefully embedded payment design. We further rigorously prove that our approach achieves truthfulness and individual rationality, and leads to a constant competitive ratio for the long-term social welfare. Via extensive evaluations with real-world data and settings, we have validated the superior practical performance of our approach over multiple alternative methods.
ISSN: 2641-9874 abstractTranslation: 云边系统是重要的紧急需求响应(EDR)参与者,有助于维持电网稳定和供需平衡。然而,随着用户越来越多地在云边系统中执行人工智能 (AI) 工作负载,现有的 EDR 管理并非针对 AI 工作负载而设计,因此面临以下关键挑战:能耗与 AI 模型精度之间的复杂权衡、AI 模型量化导致的模型精度下降、AI 训练截止期限的限制,以及 AI 任务到达的不确定性。在本文中,我们针对联邦学习(FL)设计了一种基于拍卖的方法来克服所有这些挑战。我们首先为长期社会福利优化建立了一个非线性混合整数规划。然后,我们提出了一种新颖的算法方法:生成候选训练调度,将原问题重新表述为新的调度选择问题,并使用带有精心嵌入的支付设计的在线原始-对偶算法来求解这个新问题。我们进一步严格证明了我们的方法满足真实性和个体理性,并在长期社会福利上达到常数竞争比。通过对真实世界数据和设置的广泛评估,我们验证了我们的方法相对于多种替代方法的优越实际性能。
Authors: Fei Wang; Ethan Hugh; Baochun Li
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: With increasing concerns on privacy leakage from gradients, a variety of attack mechanisms emerged to recover private data from gradients at an honest-but-curious server, which challenged the primary advantage of privacy protection in federated learning. However, we cast doubt upon the real impact of these gradient attacks on production federated learning systems. By taking away several impractical assumptions that the literature has made, we find that gradient attacks pose a limited degree of threat to the privacy of raw data.Through a comprehensive evaluation on existing gradient attacks in a federated learning system with practical assumptions, we have systematically analyzed their effectiveness under a wide range of configurations. We present key priors required to make the attack possible or stronger, such as a narrow distribution of initial model weights, as well as inversion at early stages of training. We then propose a new lightweight defense mechanism that provides sufficient and self-adaptive protection against time-varying levels of the privacy leakage risk throughout the federated learning process. As a variation of gradient perturbation method, our proposed defense, called Outpost, selectively adds Gaussian noise to gradients at each update iteration according to the Fisher information matrix, where the level of noise is determined by the privacy leakage risk quantified by the spread of model weights at each layer. To limit the computation overhead and training performance degradation, Outpost only performs perturbation with iteration-based decay. Our experimental results demonstrate that Outpost can achieve a much better tradeoff than the state-of-the-art with respect to convergence performance, computational overhead, and protection against gradient attacks.
ISSN: 2641-9874 abstractTranslation: 随着人们对梯度隐私泄露问题的日益关注,出现了各种攻击机制,可在诚实但好奇的服务器上从梯度中恢复私有数据,这对联邦学习中隐私保护这一核心优势提出了挑战。然而,我们对这些梯度攻击对生产级联邦学习系统的实际影响表示怀疑。在去掉文献中的一些不切实际的假设后,我们发现梯度攻击对原始数据隐私构成的威胁程度有限。通过在采用实际假设的联邦学习系统中对现有梯度攻击进行全面评估,我们系统地分析了它们在各种配置下的有效性。我们给出了使攻击成为可能或更强所需的关键先验条件,例如初始模型权重的狭窄分布,以及在训练早期阶段进行反演。然后,我们提出了一种新的轻量级防御机制,在整个联邦学习过程中针对随时间变化的隐私泄露风险提供充分且自适应的保护。作为梯度扰动方法的一种变体,我们提出的防御方法称为 Outpost,它根据 Fisher 信息矩阵在每次更新迭代时有选择地向梯度添加高斯噪声,其中噪声水平由每层模型权重的分布范围所量化的隐私泄露风险决定。为了限制计算开销和训练性能下降,Outpost 仅以基于迭代的衰减方式执行扰动。我们的实验结果表明,在收敛性能、计算开销和抵御梯度攻击方面,Outpost 可以取得比最先进方法更好的权衡。
Notes:
INFOCOM'23 Best Paper Award (https://weibo.com/2174209470/MBt1Mofxv)
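按摘要描述,Outpost 的扰动强度由每层权重的分布范围量化风险、并随迭代衰减。下面是一个示意草图(非官方实现,衰减形式与系数均为假设):

```python
# 示意性草图(非 Outpost 官方实现):按各层权重的离散程度确定噪声强度,并随迭代衰减
# 假设:layer_grads / layer_weights 为 {层名: numpy 数组};衰减形式与系数为假设值
import numpy as np

def outpost_like_perturb(layer_grads, layer_weights, iteration, base_scale=0.1, decay=0.01):
    """向每层梯度加入高斯噪声,噪声标准差与该层权重的标准差成正比,并按迭代衰减。"""
    factor = base_scale / (1.0 + decay * iteration)     # 基于迭代次数的衰减
    noisy = {}
    for name, g in layer_grads.items():
        risk = np.std(layer_weights[name])               # 用权重分布范围粗略量化泄露风险
        noisy[name] = g + np.random.normal(0.0, factor * risk, size=g.shape)
    return noisy

# 用法示例:两层玩具模型,在第 50 次迭代时扰动梯度
weights = {"fc1": np.random.randn(64, 32), "fc2": np.random.randn(32, 10)}
grads = {k: np.random.randn(*v.shape) for k, v in weights.items()}
print({k: v.shape for k, v in outpost_like_perturb(grads, weights, iteration=50).items()})
```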
Authors: Ming Tang; Vincent W. S. Wong
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: In federated learning, clients cooperatively train a global model by training local models over their datasets under the coordination of a central server. However, clients may sometimes be unavailable for training due to their network connections and energy levels. Considering the highly non-independent and identically distributed (non-IID) degree of the clients’ datasets, the local models of the available clients being sampled for training may not represent those of all other clients. This is referred as system induced bias. In this work, we quantify the system induced bias due to time-varying client availability. The theoretical result shows that this bias occurs independently of the number of available clients and the number of clients being sampled in each training round. To address system induced bias, we propose a FedSS algorithm by incorporating stratified sampling and prove that the proposed algorithm is unbiased. We quantify the impact of system parameters on the algorithm performance and derive the performance guarantee of our proposed FedSS algorithm. Theoretical and experimental results on CIFAR-10 and MNIST datasets show that our proposed FedSS algorithm outperforms several benchmark algorithms by up to 5.1 times in terms of the algorithm convergence rate.
ISSN: 2641-9874 abstractTranslation: 在联邦学习中,客户端在中央服务器的协调下,通过在各自数据集上训练本地模型来协作训练全局模型。然而,客户端有时可能由于网络连接和电量水平的原因而无法参与训练。考虑到客户端数据集高度非独立同分布(non-IID),被采样参与训练的可用客户端的本地模型可能无法代表所有其他客户端的本地模型。这被称为系统引起的偏差。在这项工作中,我们量化了由随时间变化的客户端可用性所引起的系统偏差。理论结果表明,这种偏差的出现与可用客户端的数量以及每轮训练中被采样的客户端数量无关。为了解决系统引起的偏差,我们提出了一种引入分层采样的 FedSS 算法,并证明该算法是无偏的。我们量化了系统参数对算法性能的影响,并推导出所提 FedSS 算法的性能保证。在 CIFAR-10 和 MNIST 数据集上的理论和实验结果表明,我们提出的 FedSS 算法在算法收敛速度方面最多比多种基准算法好 5.1 倍。
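FedSS 的思路是用分层采样消除客户端可用性变化带来的系统偏差。下面是一个示意草图(非论文算法,分层方式与配额分配规则均为假设),演示"先分层、再按比例在层内抽样"的基本形态:

```python
# 示意性草图(非 FedSS 原始算法):按数据分布把客户端分层,每层内按比例抽样
# 假设:strata 为 {层标签: 客户端 id 列表};每轮总采样数 m 为假设参数
import random

def stratified_sample(strata, m):
    """每个层按其客户端占比分配名额,再在层内均匀抽取,降低可用性变化带来的偏差。"""
    total = sum(len(c) for c in strata.values())
    sampled = []
    for label, clients in strata.items():
        quota = max(1, round(m * len(clients) / total))   # 按比例分配,至少抽 1 个
        sampled.extend(random.sample(clients, min(quota, len(clients))))
    return sampled

# 用法示例:三个层(例如按标签分布聚类得到),每轮采样 6 个客户端
strata = {"A": list(range(0, 10)), "B": list(range(10, 16)), "C": list(range(16, 20))}
print(stratified_sample(strata, m=6))
```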
Authors: Ningxin Su; Baochun Li
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Thanks to regulatory policies such as the General Data Protection Regulation (GDPR), it is essential to provide users with the right to erasure regarding their own private data, even if such data has been used to train a neural network model. Such a machine unlearning problem becomes even more challenging in the context of federated learning, where clients collaborate to train a global model with their private data. When a client requests its data to be erased, its effects have already gradually permeated through a large number of clients, as the server aggregates client updates over multiple communication rounds. All of these affected clients need to participate in the retraining process, leading to prohibitive retraining costs with respect to the wall-clock training time.In this paper, we present the design and implementation of Knot, a new clustered aggregation mechanism custom-tailored to asynchronous federated learning. The design of Knot is based upon our intuition that, with asynchronous federated learning, clients can be divided into clusters, and aggregation can be performed within each cluster only so that retraining due to data erasure can be limited to within each cluster as well. To optimize client-cluster assignment, we formulated a lexicographical minimization problem that could be transformed into a linear programming problem and solved efficiently. Over a variety of datasets and tasks, we have shown clear evidence that Knot outperformed the state-of-the-art federated unlearning mechanisms by up to 85% in the context of asynchronous federated learning.
ISSN: 2641-9874 abstractTranslation: 得益于《通用数据保护条例》(GDPR) 等监管政策,为用户提供对其私有数据的删除权至关重要,即使这些数据已被用于训练神经网络模型。在联邦学习的背景下,这种机器遗忘(machine unlearning)问题变得更具挑战性:客户端协作利用各自的私有数据训练全局模型,当某个客户端请求删除其数据时,由于服务器会在多轮通信中聚合客户端更新,该数据的影响已经逐渐渗透到大量客户端中。所有这些受影响的客户端都需要参与重训练过程,导致以挂钟训练时间衡量的重训练成本高得令人难以接受。在本文中,我们介绍了 Knot 的设计与实现,这是一种专为异步联邦学习定制的新型聚类聚合机制。Knot 的设计基于这样的直觉:在异步联邦学习中,客户端可以被划分为若干集群,聚合仅在每个集群内部进行,因此由数据删除引起的重训练也可以被限制在相应集群内。为了优化客户端到集群的分配,我们构建了一个字典序最小化问题,该问题可以转化为线性规划问题并高效求解。在多种数据集和任务上,我们给出了明确的证据:在异步联邦学习场景下,Knot 的性能比最先进的联邦遗忘机制最多高出 85%。
Notes:
[PDF](https://iqua.ece.toronto.edu/papers/ningxinsu-infocom23.pdf)
[CODE](https://github.com/TL-System/plato/tree/main/examples/knot)
Authors: Angelo Rodio; Francescomaria Faticanti; Othmane Marfoq; Giovanni Neglia; Emilio Leonardi
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: The enormous amount of data produced by mobile and IoT devices has motivated the development of federated learning (FL), a framework allowing such devices (or clients) to collaboratively train machine learning models without sharing their local data. FL algorithms (like FedAvg) iteratively aggregate model updates computed by clients on their own datasets. Clients may exhibit different levels of participation, often correlated over time and with other clients. This paper presents the first convergence analysis for a FedAvg-like FL algorithm under heterogeneous and correlated client availability. Our analysis highlights how correlation adversely affects the algorithm’s convergence rate and how the aggregation strategy can alleviate this effect at the cost of steering training toward a biased model. Guided by the theoretical analysis, we propose CA-Fed, a new FL algorithm that tries to balance the conflicting goals of maximizing convergence speed and minimizing model bias. To this purpose, CA-Fed dynamically adapts the weight given to each client and may ignore clients with low availability and large correlation. Our experimental results show that CA-Fed achieves higher time-average accuracy and a lower standard deviation than state-of-the-art AdaFed and F3AST, both on synthetic and real datasets.
ISSN: 2641-9874 abstractTranslation: 移动和物联网设备产生的大量数据推动了联邦学习 (FL) 的发展,该框架允许此类设备(或客户端)协作训练机器学习模型,而无需共享本地数据。FL 算法(如 FedAvg)迭代地聚合客户端在其自己的数据集上计算的模型更新。客户可能表现出不同程度的参与,通常随着时间的推移并与其他客户相关。本文提出了异构和相关客户端可用性下类似 FedAvg FL 算法的首次收敛分析。我们的分析强调了相关性如何对算法的收敛速度产生不利影响,以及聚合策略如何以将训练转向有偏差的模型为代价来减轻这种影响。在理论分析的指导下,我们提出了 CA-Fed,一种新的 FL 算法,试图平衡最大化收敛速度和最小化模型偏差这两个相互冲突的目标。为此,CA-Fed 动态调整给予每个客户端的权重,并可能忽略可用性低且相关性大的客户端。我们的实验结果表明,无论是在合成数据集还是真实数据集上,CA-Fed 都比最先进的 AdaFed 和 F3AST 实现了更高的时间平均精度和更低的标准偏差。
Notes:
[PDF](https://arxiv.org/abs/2301.04632)
[CODE](https://github.com/arodio/ca-fed)
Authors: Tung-Anh Nguyen; Jiayu He; Long Tan Le; Wei Bao; Nguyen H. Tran
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: In the era of Internet of Things (IoT), network-wide anomaly detection is a crucial part of monitoring IoT networks due to the inherent security vulnerabilities of most IoT devices. Principal Components Analysis (PCA) has been proposed to separate network traffics into two disjoint subspaces corresponding to normal and malicious behaviors for anomaly detection. However, the privacy concerns and limitations of devices’ computing resources compromise the practical effectiveness of PCA. We propose a federated PCA learning using Grassmann manifold optimization, which coordinates IoT devices to aggregate a joint profile of normal network behaviors for anomaly detection. First, we introduce a privacy-preserving federated PCA framework to simultaneously capture the profile of various IoT devices’ traffic. Then, we investigate the alternating direction method of multipliers gradient-based learning on the Grassmann manifold to guarantee fast training and low detecting latency with limited computational resources. Finally, we show that the computational complexity of the Grassmann manifold-based algorithm is satisfactory for hardware-constrained IoT devices. Empirical results on the NSL-KDD dataset demonstrate that our method outperforms baseline approaches.
ISSN: 2641-9874 abstractTranslation: 在物联网(IoT)时代,由于大多数物联网设备固有的安全漏洞,全网范围的异常检测是监控物联网网络的关键环节。已有工作提出用主成分分析(PCA)将网络流量分解到对应正常行为和恶意行为的两个不相交子空间中,以进行异常检测。然而,隐私问题和设备计算资源的限制削弱了 PCA 的实际有效性。我们提出了一种基于 Grassmann 流形优化的联邦 PCA 学习方法,协调物联网设备聚合出正常网络行为的联合画像,用于异常检测。首先,我们引入一个保护隐私的联邦 PCA 框架,以同时刻画各类物联网设备流量的特征。然后,我们研究了 Grassmann 流形上基于梯度的交替方向乘子法(ADMM)学习,以在有限的计算资源下保证快速训练和低检测延迟。最后,我们证明了基于 Grassmann 流形的算法的计算复杂度对于硬件受限的物联网设备是可以接受的。在 NSL-KDD 数据集上的实证结果表明我们的方法优于基线方法。
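论文沿用了"正常/异常子空间分解"的 PCA 异常检测思想。下面给出一个单机版的示意草图(非论文的联邦 / Grassmann 流形算法,k 与阈值分位数均为假设),说明用主子空间重构误差判定异常流量的基本做法:

```python
# 示意性草图(非论文的 Grassmann 流形算法):用 PCA 主子空间的重构误差做流量异常检测
# 假设:traffic 的每一行是一条流量特征向量;k 与阈值分位数均为假设参数
import numpy as np

def fit_normal_subspace(traffic, k=5):
    """对(中心化后的)正常流量做 PCA,返回均值与前 k 个主方向。"""
    mean = traffic.mean(axis=0)
    _, _, vt = np.linalg.svd(traffic - mean, full_matrices=False)
    return mean, vt[:k].T                          # 形状 (d, k)

def anomaly_scores(x, mean, basis):
    """残差子空间上的能量(重构误差)越大,越可能是异常流量。"""
    centered = x - mean
    recon = centered @ basis @ basis.T
    return np.linalg.norm(centered - recon, axis=1)

# 用法示例:在正常流量上拟合子空间,用 99% 分位数作为报警阈值
normal = np.random.randn(500, 20)
mean, basis = fit_normal_subspace(normal)
scores = anomaly_scores(np.random.randn(10, 20) * 3, mean, basis)
threshold = np.quantile(anomaly_scores(normal, mean, basis), 0.99)
print(scores > threshold)
```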
Authors: Heting Liu; Fang He; Guohong Cao
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) enables geographically dispersed edge devices (i.e., clients) to learn a global model without sharing the local datasets, where each client performs gradient descent with its local data and uploads the gradients to a central server to update the global model. However, FL faces massive communication overhead resulted from uploading the gradients in each training round. To address this problem, most existing research compresses the gradients with fixed and unified quantization for all the clients, which neither seeks adaptive quantization due to the varying gradient norms at different rounds, nor exploits the heterogeneity of the clients to accelerate FL. In this paper, we propose a novel adaptive and heterogeneous gradient quantization algorithm (AdaGQ) for FL to minimize the wall-clock training time from two aspects: i) adaptive quantization which exploits the change of gradient norm to adjust the quantization resolution in each training round; and ii) heterogeneous quantization which assigns lower quantization resolution to slow clients to align their training time with other clients to mitigate the communication bottleneck, and higher quantization resolution to fast clients to achieve a better communication efficiency and accuracy tradeoff. Evaluations based on various models and datasets validate the benefits of AdaGQ, reducing the total training time by up to 52.1% compared to baseline algorithms (e.g., FedAvg, QSGD).
ISSN: 2641-9874 abstractTranslation: 联邦学习(FL)使地理上分散的边缘设备(即客户端)能够在不共享本地数据集的情况下学习全局模型:每个客户端使用其本地数据执行梯度下降,并将梯度上传到中央服务器以更新全局模型。然而,FL 面临着每轮训练上传梯度所带来的巨大通信开销。为了解决这个问题,大多数现有研究对所有客户端采用固定且统一的量化来压缩梯度,既没有针对不同轮次梯度范数的变化进行自适应量化,也没有利用客户端的异构性来加速 FL。在本文中,我们提出了一种新颖的自适应异构梯度量化算法(AdaGQ),从两个方面最小化 FL 的挂钟训练时间:i) 自适应量化,利用梯度范数的变化来调整每个训练轮次的量化分辨率;ii) 异构量化,为慢速客户端分配较低的量化分辨率,使其训练时间与其他客户端对齐以缓解通信瓶颈,为快速客户端分配较高的量化分辨率,以在通信效率和精度之间取得更好的折中。基于多种模型和数据集的评估验证了 AdaGQ 的优势:与基线算法(例如 FedAvg、QSGD)相比,总训练时间最多减少 52.1%。
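AdaGQ 的基础是可调比特数的随机量化。下面是一个示意草图(非论文实现,比特数的分配策略仅作演示),展示期望无偏的随机均匀量化以及比特数对误差的影响:

```python
# 示意性草图(非 AdaGQ 原始实现):随机均匀量化,比特数可按客户端快慢动态调整
# 假设:梯度为一维 numpy 向量;慢客户端用低比特、快客户端用高比特仅作演示
import numpy as np

def stochastic_quantize(grad, bits):
    """把梯度随机舍入到 2^bits 个均匀量化级,期望上无偏。"""
    levels = 2 ** bits - 1
    lo, hi = grad.min(), grad.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    normalized = (grad - lo) / scale
    floor = np.floor(normalized)
    # 以小数部分为概率向上取整,保证 E[量化值] = 原值
    quantized = floor + (np.random.rand(*grad.shape) < (normalized - floor))
    return quantized.astype(np.uint32), lo, scale

def dequantize(q, lo, scale):
    return q * scale + lo

# 用法示例:慢客户端用 4 比特,快客户端用 8 比特
g = np.random.randn(10000)
for bits in (4, 8):
    q, lo, scale = stochastic_quantize(g, bits)
    err = np.linalg.norm(dequantize(q, lo, scale) - g) / np.linalg.norm(g)
    print(bits, "bits, relative error:", round(float(err), 4))
```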
Authors: Yunming Liao; Yang Xu; Hongli Xu; Lun Wang; Chen Qian
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Data generated at the network edge can be processed locally by leveraging the paradigm of edge computing (EC). Aided by EC, decentralized federated learning (DFL), which overcomes the single-point-of-failure problem in the parameter server (PS) based federated learning, is becoming a practical and popular approach for machine learning over distributed data. However, DFL faces two critical challenges, i.e., system heterogeneity and statistical heterogeneity introduced by edge devices. To ensure fast convergence with the existence of slow edge devices, we present an efficient DFL method, termed FedHP, which integrates adaptive control of both local updating frequency and network topology to better support the heterogeneous participants. We establish a theoretical relationship between local updating frequency and network topology regarding model training performance and obtain a convergence upper bound. Upon this, we propose an optimization algorithm, that adaptively determines local updating frequencies and constructs the network topology, so as to speed up convergence and improve the model accuracy. Evaluation results show that the proposed FedHP can reduce the completion time by about 51% and improve model accuracy by at least 5% in heterogeneous scenarios, compared with the baselines.
ISSN: 2641-9874 abstractTranslation: 网络边缘生成的数据可以利用边缘计算(EC)范式在本地进行处理。在EC的帮助下,去中心化联邦学习(DFL)克服了基于参数服务器(PS)的联邦学习中的单点故障问题,正在成为一种实用且流行的分布式数据机器学习方法。然而,DFL 面临两个关键挑战,即边缘设备引入的系统异构性和统计异构性。为了确保在慢速边缘设备存在的情况下快速收敛,我们提出了一种高效的 DFL 方法,称为 FedHP,它集成了本地更新频率和网络拓扑的自适应控制,以更好地支持异构参与者。我们建立了关于模型训练性能的局部更新频率和网络拓扑之间的理论关系,并获得了收敛上限。在此基础上,我们提出了一种优化算法,自适应地确定局部更新频率并构建网络拓扑,从而加快收敛速度并提高模型精度。评估结果表明,与基线相比,所提出的 FedHP 在异构场景下可以减少约 51% 的完成时间,并提高至少 5% 的模型精度。
Authors: Peichun Li; Guoliang Cheng; Xumin Huang; Jiawen Kang; Rong Yu; Yuan Wu; Miao Pan
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: In this work, we investigate the challenging problem of on-demand federated learning (FL) over heterogeneous edge devices with diverse resource constraints. We propose a cost-adjustable FL framework, named AnycostFL, that enables diverse edge devices to efficiently perform local updates under a wide range of efficiency constraints. To this end, we design the model shrinking to support local model training with elastic computation cost, and the gradient compression to allow parameter transmission with dynamic communication overhead. An enhanced parameter aggregation is conducted in an element-wise manner to improve the model performance. Focusing on AnycostFL, we further propose an optimization design to minimize the global training loss with personalized latency and energy constraints. By revealing the theoretical insights of the convergence analysis, personalized training strategies are deduced for different devices to match their locally available resources. Experiment results indicate that, when compared to the state-of-the-art efficient FL algorithms, our learning framework can reduce up to 1.9 times of the training latency and energy consumption for realizing a reasonable global testing accuracy. Moreover, the results also demonstrate that, our approach significantly improves the converged global accuracy.
ISSN: 2641-9874 abstractTranslation: 在这项工作中,我们研究了在具有不同资源限制的异构边缘设备上进行按需联邦学习(FL)的挑战性问题。我们提出了一种成本可调整的 FL 框架,名为 AnycostFL,它使不同的边缘设备能够在广泛的效率约束下有效地执行本地更新。为此,我们设计了模型收缩以支持具有弹性计算成本的本地模型训练,并设计了梯度压缩以允许具有动态通信开销的参数传输。以元素方式进行增强的参数聚合以提高模型性能。针对 AnycostFL,我们进一步提出了一种优化设计,通过个性化的延迟和能量约束来最小化全局训练损失。通过揭示收敛分析的理论见解,为不同设备推导出个性化训练策略,以匹配其本地可用资源。实验结果表明,与最先进的高效 FL 算法相比,我们的学习框架可以减少多达 1.9 倍的训练延迟和能耗,从而实现合理的全局测试精度。此外,结果还表明,我们的方法显着提高了收敛的全局精度。
Authors: Anran Li; Hongyi Peng; Lan Zhang; Jiahui Huang; Qing Guo; Han Yu; Yang Liu
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Vertical Federated Learning (VFL) enables multiple data owners, each holding a different subset of features about largely overlapping sets of data sample(s), to jointly train a useful global model. Feature selection (FS) is important to VFL. It is still an open research problem as existing FS works designed for VFL either assumes prior knowledge on the number of noisy features or prior knowledge on the post-training threshold of useful features to be selected, making them unsuitable for practical applications. To bridge this gap, we propose the Federated Stochastic Dual-Gate based Feature Selection (FedSDG-FS) approach. It consists of a Gaussian stochastic dual-gate to efficiently approximate the probability of a feature being selected, with privacy protection through Partially Homomorphic Encryption without a trusted third-party. To reduce overhead, we propose a feature importance initialization method based on Gini impurity, which can accomplish its goals with only two parameter transmissions between the server and the clients. Extensive experiments on both synthetic and real-world datasets show that FedSDG-FS significantly outperforms existing approaches in terms of achieving accurate selection of high-quality features as well as building global models with improved performance.
ISSN: 2641-9874 abstractTranslation: 纵向联邦学习 (VFL) 使多个数据所有者能够联合训练一个有用的全局模型,其中每个数据所有者持有大体重叠的数据样本集合上不同的特征子集。特征选择(FS)对 VFL 很重要,但仍是一个开放的研究问题:现有为 VFL 设计的 FS 工作要么假设已知噪声特征数量的先验知识,要么假设已知有用特征训练后选择阈值的先验知识,因而不适合实际应用。为了弥补这一差距,我们提出了基于联邦随机双门的特征选择(FedSDG-FS)方法。它由高斯随机双门构成,用于高效近似特征被选中的概率,并通过部分同态加密在无需可信第三方的情况下保护隐私。为了减少开销,我们提出了一种基于基尼不纯度的特征重要性初始化方法,只需在服务器和客户端之间进行两次参数传输即可完成。在合成数据集和真实数据集上的大量实验表明,FedSDG-FS 在准确选择高质量特征以及构建性能更优的全局模型方面显著优于现有方法。
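FedSDG-FS 使用高斯随机门近似特征被选中的概率。下面给出一个不含加密部分的单机示意草图(非论文实现,基于 PyTorch,sigma 与正则系数均为假设),说明随机门做特征选择的基本机制:

```python
# 示意性草图(非 FedSDG-FS 原始实现):高斯随机门用于特征选择的最小示例
# 假设:用 PyTorch 实现;sigma 固定、正则系数 lam 为假设超参数
import torch
import torch.nn as nn

class StochasticGate(nn.Module):
    """每个特征对应一个可学习的 mu,训练时 z = clamp(mu + sigma*eps, 0, 1) 作为门控值。"""
    def __init__(self, num_features, sigma=0.5):
        super().__init__()
        self.mu = nn.Parameter(torch.full((num_features,), 0.5))
        self.sigma = sigma

    def forward(self, x):
        eps = torch.randn_like(self.mu) if self.training else 0.0
        z = torch.clamp(self.mu + self.sigma * eps, 0.0, 1.0)
        return x * z                       # 门控后的特征

    def open_prob(self):
        """P(z > 0) = Phi(mu / sigma),可作为稀疏正则项,鼓励关掉无用特征。"""
        normal = torch.distributions.Normal(0.0, 1.0)
        return normal.cdf(self.mu / self.sigma)

# 用法示例:门控层 + 线性分类头,损失中加入稀疏正则
gate, head, lam = StochasticGate(20), nn.Linear(20, 2), 1e-2
x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
loss = nn.functional.cross_entropy(head(gate(x)), y) + lam * gate.open_prob().sum()
loss.backward()
print(gate.mu.grad.shape)
```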
Network Adaptive Federated Learning: Congestion and Lossy Compression
Authors: Parikshit Hegde; Gustavo de Veciana; Aryan Mokhtari
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: In order to achieve the dual goals of privacy and learning across distributed data, Federated Learning (FL) systems rely on frequent exchanges of large files (model updates) between a set of clients and the server. As such FL systems are exposed to, or indeed the cause of, congestion across a wide set of network resources. Lossy compression can be used to reduce the size of exchanged files and associated delays, at the cost of adding noise to model updates. By judiciously adapting clients’ compression to varying network congestion, an FL application can reduce wall clock training time. To that end, we propose a Network Adaptive Compression (NAC-FL) policy, which dynamically varies the client’s lossy compression choices to network congestion variations. We prove, under appropriate assumptions, that NAC-FL is asymptotically optimal in terms of directly minimizing the expected wall clock training time. Further, we show via simulation that NAC-FL achieves robust performance improvements with higher gains in settings with positively correlated delays across time.
ISSN: 2641-9874 abstractTranslation: 为了实现跨分布式数据的隐私和学习的双重目标,联邦学习(FL)系统依赖于一组客户端和服务器之间频繁交换大文件(模型更新)。因此,FL 系统面临大量网络资源的拥塞,或者实际上是拥塞的原因。有损压缩可用于减少交换文件的大小和相关延迟,但代价是增加模型更新的噪声。通过明智地调整客户端压缩以适应不同的网络拥塞情况,FL 应用程序可以减少挂钟训练时间。为此,我们提出了网络自适应压缩(NAC-FL)策略,该策略根据网络拥塞变化动态改变客户端的有损压缩选择。我们证明,在适当的假设下,NAC-FL 在直接最小化预期挂钟训练时间方面是渐近最优的。此外,我们通过仿真表明,NAC-FL 实现了稳健的性能改进,在延迟随时间呈正相关的设置中获得了更高的增益。
Authors: Dong-Jun Han; Do-Yeon Kim; Minseok Choi; Christopher G. Brinton; Jaekyun Moon
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: A fundamental challenge to providing edge-AI services is the need for a machine learning (ML) model that achieves personalization (i.e., to individual clients) and generalization (i.e., to unseen data) properties concurrently. Existing techniques in federated learning (FL) have encountered a steep tradeoff between these objectives and impose large computational requirements on edge devices during training and inference. In this paper, we propose SplitGP, a new split learning solution that can simultaneously capture generalization and personalization capabilities for efficient inference across resource-constrained clients (e.g., mobile/IoT devices). Our key idea is to split the full ML model into client-side and server-side components, and impose different roles to them: the client-side model is trained to have strong personalization capability optimized to each client’s main task, while the server-side model is trained to have strong generalization capability for handling all clients’ out-of-distribution tasks. We analytically characterize the convergence behavior of SplitGP, revealing that all client models approach stationary points asymptotically. Further, we analyze the inference time in SplitGP and provide bounds for determining model split ratios. Experimental results show that SplitGP outperforms existing baselines by wide margins in inference time and test accuracy for varying amounts of out-of-distribution samples.
ISSN: 2641-9874 abstractTranslation: 提供边缘人工智能服务的一个基本挑战是需要一个机器学习(ML)模型来同时实现个性化(即针对个人客户)和泛化(即针对看不见的数据)属性。联邦学习 (FL) 中的现有技术在这些目标之间遇到了严重的权衡,并且在训练和推理过程中对边缘设备提出了大量的计算要求。在本文中,我们提出了 SplitGP,这是一种新的分割学习解决方案,可以同时捕获泛化和个性化功能,以便跨资源受限的客户端(例如移动/物联网设备)进行高效推理。我们的关键思想是将完整的机器学习模型分为客户端和服务器端组件,并赋予它们不同的角色:客户端模型经过训练,具有针对每个客户端的主要任务进行优化的强大个性化能力,而服务器端模型侧模型经过训练,具有强大的泛化能力,可以处理所有客户的分布外任务。我们分析地描述了 SplitGP 的收敛行为,揭示了所有客户端模型都渐近地逼近驻点。此外,我们分析了 SplitGP 中的推理时间,并提供了确定模型分割比率的界限。实验结果表明,对于不同数量的分布外样本,SplitGP 在推理时间和测试准确性方面大幅优于现有基线。
Authors: Yixuan Guan; Xuefeng Liu; Tao Ren; Jianwei Niu
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) trains a shared global model by periodically aggregating gradients from local devices. Communication overhead becomes a principal bottleneck in FL since participating devices usually suffer from limited bandwidth and unreliable connections in uplink transmission. To address this problem, the gradient compression methods based on compressed sensing (CS) theory have been put forward recently. However, most existing CS-based works compress gradients independently, ignoring the gradient correlations between participants or adjacent communication rounds, which constrains the achievement of higher compression rates. In view of the above observation, we propose a novel gradient compression scheme named FedDCS, guided by distributed compressed sensing (DCS) theory. Following the design philosophy of separate encoding and joint decoding in DCS, FedDCS compresses gradients for participants in each round separately while reconstructing them at the central server jointly via fully exploiting correlated gradients from the previous round, which are known as side information (SI). Benefiting from this design, reconstruction performance is significantly improved with fewer decoding errors also iterations under the identical compression rate, and the total uploading bits to achieve model convergence are considerably reduced. Theoretical analysis and extensive experiments conducted on MNIST and Fashion-MNIST both verify the effectiveness of our approach.
ISSN: 2641-9874 abstractTranslation: 联邦学习 (FL) 通过定期聚合本地设备的梯度来训练共享的全局模型。由于参与设备在上行传输中通常带宽有限且连接不可靠,通信开销成为 FL 的主要瓶颈。为了解决这个问题,近来已有基于压缩感知(CS)理论的梯度压缩方法被提出。然而,大多数现有的基于 CS 的工作独立地压缩梯度,忽略了参与者之间或相邻通信轮次之间的梯度相关性,这限制了更高压缩率的实现。鉴于上述观察,我们在分布式压缩感知(DCS)理论的指导下提出了一种新颖的梯度压缩方案 FedDCS。遵循 DCS 中分别编码、联合解码的设计理念,FedDCS 对每轮各参与者的梯度分别进行压缩,同时在中央服务器上通过充分利用上一轮的相关梯度(即边信息,SI)对它们进行联合重建。得益于这种设计,在相同压缩率下重建性能显著提升,解码误差和迭代次数均更少,并且达到模型收敛所需的总上传比特数大幅减少。在 MNIST 和 Fashion-MNIST 上进行的理论分析和大量实验均验证了我们方法的有效性。
Authors: Ningning Ding; Lin Gao; Jianwei Huang
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning protects users’ data privacy though sharing users’ local model parameters (instead of raw data) with a server. However, when massive users train a large machine learning model through federated learning, the dynamically varying and often heavy communication overhead can put significant pressure on the network operator. The operator may choose to dynamically change the network prices in response, which will eventually affect the payoffs of the server and users. This paper considers the under-explored yet important issue of the joint design of participation incentives (for encouraging users’ contribution to federated learning) and network pricing (for managing network resources). Due to heterogeneous users’ private information and multi-dimensional decisions, the optimization problems in Stage I of multi-stage games are non-convex. Nevertheless, we are able to analytically derive the corresponding optimal contract and pricing mechanism through proper transformations of constraints, variables, and functions, under both vertical and horizontal interaction structures of the participants. We show that the vertical structure is better than the horizontal one, as it avoids the interests misalignment between the server and the network operator. Numerical results based on real-world datasets show that our proposed mechanisms decrease server’s cost by up to 24.87% comparing with the state-of-the-art benchmarks.
ISSN: 2641-9874 abstractTranslation: 联邦学习通过与服务器共享用户的本地模型参数(而非原始数据)来保护用户的数据隐私。然而,当大量用户通过联邦学习训练大型机器学习模型时,动态变化且往往很重的通信开销会给网络运营商带来巨大压力。运营商可能会相应地动态调整网络价格,这最终将影响服务器和用户的收益。本文研究了一个尚未充分探索但十分重要的问题:参与激励(用于鼓励用户对联邦学习做出贡献)与网络定价(用于管理网络资源)的联合设计。由于用户的私有信息具有异构性且决策是多维的,多阶段博弈第一阶段的优化问题是非凸的。尽管如此,在参与者的纵向和横向两种交互结构下,我们仍然能够通过对约束、变量和函数的适当变换,解析地推导出相应的最优契约和定价机制。我们表明纵向结构优于横向结构,因为它避免了服务器与网络运营商之间的利益错位。基于真实世界数据集的数值结果表明,与最先进的基准相比,我们提出的机制最多可将服务器成本降低 24.87%。
Authors: Yongheng Deng; Ju Ren; Cheng Tang; Feng Lyu; Yang Liu; Yaoxue Zhang
Conference : IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Abstract: Federated learning (FL) enables distributed clients to collaboratively learn a shared model while keeping their raw data private. To mitigate the system heterogeneity issues of FL and overcome the resource constraints of clients, we investigate a novel paradigm in which heterogeneous clients learn uniquely designed models with different architectures, and transfer knowledge to the server to train a larger server model that in turn helps to enhance client models. For efficient knowledge transfer between client models and server model, we propose FedHKT, a Hierarchical Knowledge Transfer framework for FL. The main idea of FedHKT is to allow clients with similar data distributions to collaboratively learn to specialize in certain classes, then the specialized knowledge of clients is aggregated to a super knowledge covering all specialties to train the server model, and finally the server model knowledge is distilled to client models. Specifically, we tailor a hybrid knowledge transfer mechanism for FedHKT, where the model parameters based and knowledge distillation (KD) based methods are respectively used for client-edge and edge-cloud knowledge transfer, which can harness the pros and evade the cons of these two approaches in learning performance and resource efficiency. Besides, to efficiently aggregate knowledge for conducive server model training, we propose a weighted ensemble distillation scheme with server-assisted knowledge selection, which aggregates knowledge by its prediction confidence, selects qualified knowledge during server model training, and uses selected knowledge to help improve client models. Extensive experiments demonstrate the superior performance of FedHKT compared to state-of-the-art baselines.
ISSN: 2641-9874 abstractTranslation: 联邦学习 (FL) 使分布式客户端能够在保持原始数据私密的同时协作学习共享模型。为了缓解 FL 的系统异构性问题并克服客户端的资源限制,我们研究了一种新颖的范式:异构客户端学习各自专门设计、架构不同的模型,并将知识传递给服务器以训练一个更大的服务器模型,该服务器模型反过来帮助增强客户端模型。为了在客户端模型和服务器模型之间高效地进行知识迁移,我们提出了 FedHKT,一个面向 FL 的分层知识迁移框架。FedHKT 的主要思想是让数据分布相似的客户端协作学习、专精于某些类别,然后将客户端的专门知识聚合为覆盖所有专长的超级知识来训练服务器模型,最后将服务器模型的知识蒸馏到客户端模型中。具体而言,我们为 FedHKT 定制了一种混合知识迁移机制:客户端-边缘之间采用基于模型参数的方法,边缘-云之间采用基于知识蒸馏(KD)的方法,从而在学习性能和资源效率上扬两种方式之长、避其之短。此外,为了高效聚合知识以利于服务器模型训练,我们提出了一种带服务器辅助知识选择的加权集成蒸馏方案:按预测置信度聚合知识,在服务器模型训练期间筛选合格的知识,并利用筛选出的知识帮助改进客户端模型。大量实验表明,FedHKT 的性能优于最先进的基线方法。
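FedHKT 的加权集成蒸馏按预测置信度聚合并筛选客户端知识。下面是一个示意草图(非论文实现,置信度阈值与加权方式均为假设),展示如何把多个客户端在公共样本上的 logits 聚合成供服务器模型蒸馏的软标签:

```python
# 示意性草图(非 FedHKT 原始实现):按预测置信度加权聚合多个客户端在公共样本上的 logits
# 假设:client_logits 形状为 (客户端数, 样本数, 类别数);置信度用 softmax 最大概率衡量
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def confidence_weighted_ensemble(client_logits, min_conf=0.6):
    """置信度低于阈值的预测不参与聚合;其余按置信度归一化加权,得到蒸馏用的软标签。"""
    probs = softmax(client_logits)                       # (C, N, K)
    conf = probs.max(axis=-1)                            # (C, N)
    weights = np.where(conf >= min_conf, conf, 0.0)      # 服务器辅助的知识筛选
    weights = weights / np.maximum(weights.sum(axis=0, keepdims=True), 1e-12)
    return (weights[..., None] * probs).sum(axis=0)      # (N, K) 软标签,供训练服务器模型

# 用法示例:5 个客户端、100 个公共样本、10 类
logits = np.random.randn(5, 100, 10) * 2
soft_labels = confidence_weighted_ensemble(logits)
print(soft_labels.shape, float(soft_labels[0].sum()))
```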
作者: 白小鱼(上海交通大学计算机系博士生)
分享仅供学习参考,若有不当,请联系我们处理。
END
热文推荐
1. 笔记分享 | 组队学习密码学(5)— 密码数学基础:初等数论
3. 论文分享 | 基于 Vector OLE 构造的恶意安全的 PSI 协议(VOLE-PSI)